7 research outputs found

ILU Smoothers for AMG with Scaled Triangular Factors

Author: Carr Arielle
Day Marc
Mullowney Paul
Thomas Stephen
Świrydowicz Kasia
Publication venue
Publication date: 03/08/2022
Field of study

ILU smoothers are effective in the algebraic multigrid (AMG) V-cycle for reducing high-frequency components of the residual error. However, direct triangular solves are comparatively slow on GPUs. Previous work by Chow and Patel (2015) and Anzt et al. (2015) demonstrated the advantages of Jacobi relaxation as an alternative. Depending on the threshold and fill-level parameters chosen, the factors can be highly non-normal, and Jacobi is unlikely to converge in a small number of iterations. The Ruiz algorithm applies row or row/column scaling to U in order to reduce the departure from normality. The inherently sequential solve is replaced with a Richardson iteration. There are several advantages beyond the lower compute time. Scaling is performed locally for a diagonal block of the global matrix because it is applied directly to the factor. An ILUT Schur complement smoother maintains a constant GMRES iteration count as the number of MPI ranks increases, and thus parallel strong-scaling is improved. The new algorithms are included in hypre, and achieve improved time to solution for several Exascale applications, including the Nalu-Wind and PeleLM pressure solvers. For large problem sizes, GMRES+AMG with iterative triangular solves executes at least five times faster than with direct solves on massively parallel GPUs.
Comment: v2 updated citation information; v3 updated results; v4 abstract updated, new results added; v5 new experimental analysis and results added

arXiv.org e-Print Archive

GPU-resident sparse direct linear solvers for alternating current optimal power flow analysis

Author: Abhyankar Shrirang
Anzt Hartwig
Göbel Fritz
Koukpaizan Nicholson
Peleš Slaven
Ribizel Tobias
Świrydowicz Kasia
Publication venue: Elsevier
Publication date: 21/11/2023
Field of study

Integrating renewable resources within the transmission grid at a wide scale poses significant challenges for economic dispatch as it requires analysis with more optimization parameters, constraints, and sources of uncertainty. This motivates the investigation of more efficient computational methods, especially those for solving the underlying linear systems, which typically take more than half of the overall computation time. In this paper, we present our work on sparse linear solvers that take advantage of hardware accelerators, such as graphics processing units (GPUs), and improve the overall performance when used within economic dispatch computations. We treat the problems as sparse, which allows for faster execution but also makes the implementation of numerical methods more challenging. We present the first GPU-native sparse direct solver that can execute on both AMD and NVIDIA GPUs. We demonstrate significant performance improvements when using high-performance linear solvers within alternating current optimal power flow (ACOPF) analysis. Furthermore, we demonstrate the feasibility of getting significant performance improvements by executing the entire computation on GPU-based hardware. Finally, we identify outstanding research issues and opportunities for even better utilization of heterogeneous systems, including those equipped with GPUs.

KITopen

GPU-Resident Sparse Direct Linear Solvers for Alternating Current Optimal Power Flow Analysis

Author: Abhyankar Shrirang
Anzt Hartwig
Göbel Fritz
Koukpaizan Nicholson
Peleš Slaven
Ribizel Tobias
Świrydowicz Kasia
Publication venue
Publication date: 15/08/2023
Field of study

arXiv.org e-Print Archive

Acceleration of tensor-product operations for high-order finite element methods

Author: Ali Karakus
Dziekoński A
Göddeke D
Kasia Świrydowicz
Lo YJ
Noel Chalmers
Stratton JA
Tim Warburton
Volkov V
Publication venue: SAGE Publications
Publication date
Field of study

Crossref

Scalability of high-performance PDE solvers

Author: Brown Jed
Camier Jean-Sylvain
Dobrev Veselin
Dutta Som
Fischer Paul
Kolev Tzanio
Kronbichler Martin
Min Misun
Rathnayake Thilina
Warburton Tim
Świrydowicz Kasia
Publication venue: SAGE Publications
Publication date: 01/01/2020
Field of study

Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this paper, we explore performance and space-time trade-offs for important compute-intensive kernels of large-scale numerical solvers for PDEs that govern a wide range of physical applications. We consider a sequence of PDE-motivated bake-off problems designed to establish best practices for efficient high-order simulations across a variety of codes and platforms. We measure peak performance (degrees of freedom per second) on a fixed number of nodes and identify effective code optimization strategies for each architecture. In addition to peak performance, we identify the minimum time to solution at 80% parallel efficiency. The performance analysis is based on spectral and p-type finite elements but is equally applicable to a broad spectrum of numerical PDE discretizations, including finite difference, finite volume, and h-type finite elements.
Comment: 25 pages, 54 figures

arXiv.org e-Print Archive

OPUS Augsburg

Scalability of high-performance PDE solvers

Author: Fischer P
Jean-Sylvain Camier
Jed Brown
Kasia Świrydowicz
Martin Kronbichler
Misun Min
Paul Fischer
Som Dutta
Thilina Rathnayake
Tim Warburton
Tzanio Kolev
Veselin Dobrev
Publication venue: SAGE Publications
Publication date
Field of study

Crossref